Abstract 



This article connects the theory of extremal doubly stochastic mea- 
sures to the geometry and topology of optimal transportation. 

We begin by reviewing an old question (Problem 111) of Birkhoff in
probability and statistics [1], which is to give a necessary and sufficient
condition on the support of a joint probability to guarantee extremality
among all measures which share its marginals. Following work of Douglas,
Lindenstrauss, and Beneš and Štěpán, Hestir and Williams [15] found a
necessary condition which is nearly sufficient; we slightly relax the
subtle measurability hypotheses separating necessity from sufficiency,
yet demonstrate by example that sufficiency certainly requires some
measurability. Their condition amounts to the vanishing of $\gamma$
outside a countable alternating sequence of graphs and antigraphs in
which no two graphs (or two antigraphs) have domains that overlap, and
where the domain of each graph / antigraph in the sequence contains the
range of the succeeding antigraph (respectively, graph). Such sequences
are called numbered limb systems. Surprisingly, this characterization can
be used to resolve the uniqueness question for optimal transportation on
manifolds with the topology of the sphere.

1 Introduction

An $n \times n$ doubly stochastic matrix refers to a matrix of non-negative
entries whose columns and rows each sum to 1. The doubly stochastic
matrices form a convex subset of all $n \times n$ matrices (in fact a convex
polytope), whose extreme points are in bijective correspondence with the
$n!$ permutations on $n$ letters, according to a theorem of Birkhoff [3] and
von Neumann [32]. For example, the $3 \times 3$ doubly stochastic matrices
form a 4-dimensional polytope with 6 vertices. Shortly after proving this
characterization, Birkhoff jH Problem 111] initiated the search for an
infinite-dimensional generalization, thus stimulating a line of research
which remains fruitful even today.
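
As a small illustration of the Birkhoff-von Neumann theorem, the following sketch greedily writes a doubly stochastic matrix as a convex combination of permutation matrices. The function name `birkhoff_decompose` and the sample matrix `M` are illustrative assumptions, not taken from the works cited here, and the brute-force search over permutations is only sensible for small $n$. Note in passing that a $3 \times 3$ doubly stochastic matrix is determined by its upper-left $2 \times 2$ block, which accounts for the four dimensions of the polytope.

```python
from fractions import Fraction
from itertools import permutations


def birkhoff_decompose(matrix):
    """Return [(weight, perm), ...] with matrix = sum of weight * P_perm.

    Assumes `matrix` is square and doubly stochastic; entries are
    converted to Fractions so that the arithmetic is exact.
    """
    n = len(matrix)
    rest = [[Fraction(v) for v in row] for row in matrix]
    terms = []
    while any(v != 0 for row in rest for v in row):
        # A permutation supported on positive entries exists as long as
        # all rows and columns of `rest` share the same positive sum.
        perm = next(p for p in permutations(range(n))
                    if all(rest[i][p[i]] > 0 for i in range(n)))
        weight = min(rest[i][perm[i]] for i in range(n))
        for i in range(n):
            rest[i][perm[i]] -= weight
        terms.append((weight, perm))
    return terms


half, quarter = Fraction(1, 2), Fraction(1, 4)
M = [[half, half, 0],
     [quarter, quarter, half],
     [quarter, quarter, half]]
terms = birkhoff_decompose(M)  # weights are non-negative and sum to 1
```

Each step zeroes at least one entry, so the loop terminates; if the input is not doubly stochastic, `next` may raise `StopIteration`.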

A doubly stochastic measure on the square refers to a non-negative Borel
probability measure on $[0,1]^2$ whose horizontal and vertical marginals
both coincide with Lebesgue measure $\lambda$ on $[0,1]$. The set of doubly
stochastic measures forms a convex set we denote by $\Gamma(\lambda,\lambda)$
(which is weak-* compact in the Banach space dual to continuous functions
$C([0,1]^2)$ normed by their suprema $\|\cdot\|_\infty$). A measure $\gamma$ is
said to be extremal in $\Gamma(\lambda,\lambda)$ if it cannot be decomposed as a
convex combination $\gamma = (1-t)\gamma_0 + t\gamma_1$ with $0 < t < 1$ and
$\gamma_0, \gamma_1 \in \Gamma(\lambda,\lambda)$, except trivially with
$\gamma_0 = \gamma_1$. Since the Krein-Milman theorem asserts that convex
combinations of extreme points are dense (in any compact convex subset of
a locally convex topological vector space, Figure 1), it is natural to want
to characterize the extreme points of $\Gamma(\lambda,\lambda)$. Another
motivation for such a characterization is that every continuous linear
functional on $\Gamma(\lambda,\lambda)$ is minimized at an extreme point.
Whether or not this extremum is uniquely attained can be an interesting
question: in Figure 1 the horizontal coordinate is minimized at a single
point but maximized at two extreme points (and along the segment joining
them).




Figure 1: Krein-Milman asserts a compact convex set K can be reconstructed 
from its extreme points (denoted here by solid circles • and solid lines — ) 

Motivated by applications like the optimization problem just mentioned,
we prefer to formulate the question in slightly greater generality, by
replacing the two copies of $([0,1],\lambda)$ with probability spaces
$(X,\mu)$ and $(Y,\nu)$, where $X$ and $Y$ are each subsets of a complete
separable metric space, and $\mu$ and $\nu$ are Borel probability measures on
$X$ and $Y$ respectively. This widens applicability of the answer to this
question without increasing its difficulty. Letting $\Gamma(\mu,\nu)$ denote
the Borel probability measures on $X \times Y$ having $\mu$ and $\nu$ for
marginals, we wish to characterize the extreme points of the convex set
$\Gamma(\mu,\nu)$. Ideally, as in the finite-dimensional case, this
characterization would be given in terms of some geometrical property of
the support of the measure $\gamma$ in $X \times Y$. Indeed, if
$\mu = \sum_{i=1}^m m_i \delta_{x_i}$ and $\nu = \sum_{j=1}^n n_j \delta_{y_j}$
are finite, our problem reduces to characterizing the extreme points of
the convex set $\Lambda$ of $m \times n$ matrices with prescribed column and
row sums:

$$\Lambda = \Big\{ (a_{ij}) \;:\; a_{ij} \ge 0,\ \ \sum_{j=1}^n a_{ij} = m_i,\ \ \sum_{i=1}^m a_{ij} = n_j \Big\}.$$

A matrix $(a_{ij})$ is well-known to be extremal in $\Lambda$ if and only if it
is acyclic, meaning for every sequence $a_{i_1 j_1}, \ldots, a_{i_k j_k}$ of
non-zero entries occupying $k \ge 2$ distinct columns and $k$ distinct rows,
the product $a_{i_1 j_2} a_{i_2 j_3} \cdots a_{i_{k-1} j_k} a_{i_k j_1}$ must
vanish; see Figure 2 or Denny [9], where the terminology aperiodic is
used. Similarly, a set $S \subset X \times Y$ is acyclic if for every $k \ge 2$
distinct points $\{x_1, \ldots, x_k\} \subset X$ and $\{y_1, \ldots, y_k\} \subset Y$,
at least one of the pairs $(x_1,y_1), (x_1,y_2), (x_2,y_2), \ldots,
(x_{k-1},y_k), (x_k,y_k), (x_k,y_1)$ lies outside of $S$.
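
For a finite set $S$, the definition above says that the bipartite graph with vertex classes $X$ and $Y$ and edge set $S$ contains no cycle, that is, it is a forest. The sketch below tests this with a union-find structure; the function name `is_acyclic` and the sample sets are hypothetical and chosen only for illustration.

```python
def is_acyclic(pairs):
    """Return True if the finite set of (x, y) pairs is acyclic.

    Each pair is treated as an edge between the vertex ("x", x) and the
    vertex ("y", y); the set is acyclic exactly when no edge joins two
    vertices that are already connected.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for x, y in set(pairs):
        root_x, root_y = find(("x", x)), find(("y", y))
        if root_x == root_y:
            return False  # this edge would close a cycle
        parent[root_x] = root_y
    return True


staircase = [(0, 0), (0, 1), (1, 1)]   # support of an extremal matrix
square = staircase + [(1, 0)]          # full 2 x 2 block: contains a cycle
```

Here `is_acyclic(staircase)` holds while `is_acyclic(square)` fails, in line with the fact that a $2 \times 2$ matrix with four non-zero entries is never extremal in $\Lambda$.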



Figure 2: In an acyclic matrix the product of x's and o's must vanish 
A functional analytic characterization of extremality was supplied by
Douglas pU] and by Lindenstrauss [21]: it asserts that $\gamma$ is extremal
in $\Gamma(\mu,\nu)$ if and only if $L^1(X, d\mu) \oplus L^1(Y, d\nu)$ is dense
in $L^1(X \times Y, d\gamma)$. Although this result is a useful starting point,
it is not quite the characterization we desire for applications, since it
is not easily expressed in terms of the geometry of the support of $\gamma$.
Significant further progress was made by Beneš and Štěpán, who showed
every extremal doubly stochastic measure vanishes outside some acyclic
subset $S \subset X \times Y$ [2]. Hestir and Williams refined this condition,
showing that it becomes sufficient under an additional Borel
measurability hypothesis which, unfortunately, is not always satisfied
[15]. Some of the subtleties of the problem were indicated already by
Losert's counterexamples [22]. The difficulty of the problem resides
partly in the fact that any geometrical characterization of extremality
must be invariant under arbitrary measure-preserving transformations
applied independently to the horizontal (abscissa) and vertical
(ordinate) variables.

In this manuscript we review this line of research, clarifying the nature
of the gap separating necessity from sufficiency and pointing out that it
can be narrowed slightly by replacing the Borel $\sigma$-algebra with
suitably adapted measure-completions. We conclude by describing an
application to the question of uniqueness in optimal transportation,
which is one of the original and most important examples of an
infinite-dimensional program [TS], and appears naturally in applications
[27] [31]. It arises when one wants to use a continuum of sources to supply
a continuum of sinks (modeled by $\mu$ and $\nu$ respectively) as efficiently
as possible. The question addressed is to identify cost functions $c(x,y)$
on the product space $X \times Y$ whose minimum expected value against
measures in $\Gamma(\mu,\nu)$ is uniquely attained. When $X$ and $Y$ are
differentiable manifolds and $c \in C^1(X \times Y)$, to guarantee uniqueness
it turns out to be sufficient that $y_1 \ne y_2$ imply
$x \in X \mapsto c(x,y_1) - c(x,y_2)$ has no critical points, except perhaps
for a single global maximum and a single global minimum. This generalizes
to some compact manifolds $X$ a criterion of Gangbo [12], Carlier [7],
Levin [20] and Ma, Trudinger and Wang [23], which asserts that the absence
of critical points implies uniqueness (their condition further implies
that almost every source supplies a single sink, thus solving another
transportation problem first posed by Monge [25], which our condition
does not do). When satisfied, our criterion implies that the manifold $X$,
if compact, has the topology of the sphere. Uniqueness, however, remains
an interesting open question for compact manifolds which are not
topological spheres. This surprising application was first developed in
an economic context by Chiappori, McCann, and Nesheim [8].



2 Measures on graphs are push-forwards 

Before recalling the characterization of interest, let us develop a bit of nota- 
tion in a simpler setting, and a key argument that we shall require. Impatient 
or knowledgeable readers can skim the present section and proceed directly 
to the final sections below. 

Let $X$ and $Y$ be subsets of complete separable metric spaces, and fix a
non-negative Borel measure $\mu$ on $X$. Suppose $f : X \to Y$ is
$\mu$-measurable, meaning $f^{-1}(B)$ is in the $\sigma$-algebra completion of
the Borel subsets of $X$ with respect to the measure $\mu$, whenever $B$ is
relatively Borel in $Y$. Then a Borel measure on $Y$ is induced, denoted
$f_\#\mu$ and called the push-forward of $\mu$ through $f$, and given by

$$(f_\#\mu)[B] := \mu[f^{-1}(B)] \qquad (1)$$

for each Borel $B \subset Y$. Defining the projections $\pi^X(x,y) = x$ and
$\pi^Y(x,y) = y$ on $X \times Y$, this notation permits the horizontal and
vertical marginals of a measure $\gamma \ge 0$ on $X \times Y$ to be expressed as
$\pi^X_\#\gamma$ and $\pi^Y_\#\gamma$ respectively.
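
For measures with finitely many atoms, definition (1) reduces to summing the masses of all points sharing the same image. The sketch below is only a discrete illustration under that assumption; the helper name `push_forward` and the sample data are hypothetical. It also computes the marginals of $(\mathrm{id}_X \times f)_\#\mu$ by pushing forward through the two coordinate projections.

```python
from fractions import Fraction


def push_forward(mu, f):
    """Push the discrete measure `mu` (dict: point -> mass) through `f`."""
    image = {}
    for x, mass in mu.items():
        image[f(x)] = image.get(f(x), 0) + mass
    return image


mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}


def f(x):
    return x % 2  # a non-injective map from X = {0, 1, 2} to Y = {0, 1}


# gamma = (id x f)_# mu lives on the graph of f inside X x Y.
gamma = push_forward(mu, lambda x: (x, f(x)))
x_marginal = push_forward(gamma, lambda pair: pair[0])  # pi^X_# gamma
y_marginal = push_forward(gamma, lambda pair: pair[1])  # pi^Y_# gamma
```

In this example the horizontal marginal recovers `mu`, while the vertical marginal coincides with `push_forward(mu, f)`.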

The next lemma shows that any measure supported on a graph can be
deduced from its horizontal marginal. It improves on Lemma 2.4 of [H] and
various other antecedents, by using an argument from Villani's Theorem 5.28
pH] to extract $\mu$-measurability of $f$ as a conclusion rather than a
hypothesis. As work of, e.g., Hestir and Williams [15] implies, although
measures on graphs are extremal in $\Gamma(\mu,\nu)$, the converse is far from
being true; this peculiarity is an inevitable consequence of the infinite
divisibility of $(X,\mu)$.

Lemma 2.1 (Measures on graphs are push-forwards) Let $X$ and $Y$ be
subsets of complete separable metric spaces, and $\gamma \ge 0$ a
$\sigma$-finite Borel measure on the product space $X \times Y$. Denote the
horizontal marginal of $\gamma$ by $\mu := \pi^X_\#\gamma$. If $\gamma$ vanishes
outside the graph of $f : X \to Y$, meaning
$\{(x,y) \in X \times Y \mid y \ne f(x)\}$ has zero outer measure, then $f$ is
$\mu$-measurable and $\gamma = (\mathrm{id}_X \times f)_\#\mu$.

Proof. Since outer measure is subadditive, it costs no generality to
assume the subsets $X$ and $Y$ are in fact complete and separable, by
extending $\gamma$ in the obvious (minimal) way. Any $\sigma$-finite Borel
measure $\gamma$ is regular and $\sigma$-compact on a complete separable metric
space; e.g. p. 255 of [H] or Theorem 1-55 of [30]. Since $\gamma$ vanishes
outside $\mathrm{Graph}(f) := \{(x, f(x)) \mid x \in X\}$, there is an increasing
sequence of compact sets $K_i \subset K_{i+1} \subset \mathrm{Graph}(f)$ whose union
$K_\infty = \lim_{i \to \infty} K_i$ contains the full mass of $\gamma$.
Compactness of $K_i \subset \mathrm{Graph}(f)$ implies continuity of $f$ on the
compact projection $X_i := \pi^X(K_i)$. Thus the restriction $f_\infty$ of $f$
to $X_\infty := \pi^X(K_\infty)$ is a Borel map whose graph
$K_\infty = \mathrm{Graph}(f_\infty)$ is a $\sigma$-compact set of full measure for
$\gamma$. We now verify that $\gamma$ and
$(\mathrm{id}_{X_\infty} \times f_\infty)_\#\mu$ assign the same mass to each Borel
rectangle $U \times V \subset X \times Y$. Since
$(U \times V) \cap \mathrm{Graph}(f_\infty) = ((U \cap f_\infty^{-1}(V)) \times Y) \cap \mathrm{Graph}(f_\infty)$
we find

$$\gamma(U \times V) = \gamma\big((U \cap f_\infty^{-1}(V)) \times Y\big) = \mu\big(U \cap f_\infty^{-1}(V)\big),$$

proving $\gamma = (\mathrm{id}_{X_\infty} \times f_\infty)_\#\mu$. Taking
$U = X \setminus X_\infty$ and $V = Y$ shows $X \setminus X_\infty$ is
$\mu$-negligible. Since $\mathrm{id}_X \times f$ differs from the Borel map
$\mathrm{id}_{X_\infty} \times f_\infty$ only on the $\mu$-negligible complement of
the $\sigma$-compact set $X_\infty$, we conclude $f$ is $\mu$-measurable and
$\gamma = (\mathrm{id}_X \times f)_\#\mu$ as desired. ■

The preceding lemma shows that any measure concentrated on a graph is
uniquely determined by its marginals; $\gamma$ is therefore extremal in
$\Gamma(\pi^X_\#\gamma, \pi^Y_\#\gamma)$. As the results of the next section
show, the converse is far from being true.

3 Numbered limb systems and extremality 

In this section we adapt Hestir and Williams' [15] notion of a numbered
limb system to $X \times Y$. Using the axiom of choice, Hestir and Williams
deduced from the acyclicity condition of Beneš and Štěpán [2] that each
extremal doubly stochastic measure vanishes outside some numbered limb
system. Conversely, they showed that vanishing outside a numbered limb
system is sufficient to guarantee extremality of a doubly stochastic
measure, provided the graphs (and antigraphs) comprising the system are
Borel subsets of the square. Our main theorem gives a new proof of this
converse in the more general setting of subsets $X \times Y$ of complete
separable metric spaces, and under a slightly weaker measurability
hypothesis on the graphs and antigraphs. A simple example shows that some
measurability hypothesis is nevertheless required. In the next section,
we shall see how this converse is germane to the question of uniqueness
in optimal transportation.

Given a map $f : D \to Y$ on $D \subset X$, we denote its graph, domain, range,
and the graph of its (multivalued) inverse by

$$\begin{aligned}
\mathrm{Graph}(f) &:= \{(x, f(x)) \mid x \in D\}, \\
\mathrm{Dom}\, f &:= \pi^X(\mathrm{Graph}(f)) = D, \\
\mathrm{Ran}\, f &:= \pi^Y(\mathrm{Graph}(f)), \\
\mathrm{Antigraph}(f) &:= \{(f(x), x) \mid x \in \mathrm{Dom}\, f\} \subset Y \times X.
\end{aligned}$$

More typically, we will be interested in the $\mathrm{Antigraph}(g) \subset X \times Y$
of a map $g : D \subset Y \to X$.

Figure 3: The subsets $I_k$ need not be connected; in this numbered limb
system they are represented as connected sets for visual convenience only.

Definition 3.1 (Numbered limb system) Let $X$ and $Y$ be Borel subsets
of complete separable metric spaces. A relation $S \subset X \times Y$ is a
numbered limb system if there is a countable disjoint decomposition of
$X = \bigcup_{i=0}^\infty I_{2i+1}$ and of $Y = \bigcup_{i=0}^\infty I_{2i}$ with a
sequence of maps $f_{2i} : \mathrm{Dom}(f_{2i}) \subset Y \to X$ and
$f_{2i+1} : \mathrm{Dom}(f_{2i+1}) \subset X \to Y$ such that
$S = \bigcup_{i=1}^\infty \mathrm{Graph}(f_{2i-1}) \cup \mathrm{Antigraph}(f_{2i})$, with
$\mathrm{Dom}(f_k) \cup \mathrm{Ran}(f_{k+1}) \subset I_k$ for each $k \ge 0$. The system
has (at most) $N$ limbs if $\mathrm{Dom}(f_k) = \emptyset$ for all $k > N$.

Notice the map $f_0$ is irrelevant to this definition though $I_0$ is not; we
may always take $\mathrm{Dom}(f_0) = \emptyset$, but require
$\mathrm{Ran}(f_1) \subset I_0$. The point is the following theorem and its
corollary, which extend and relax the result proved by Hestir and
Williams for Lebesgue measure $\mu = \nu = \lambda$ on the interval
$X = Y = [0,1]$. In it, $\Gamma(\mu,\nu)$ denotes the set of non-negative Borel
measures on $X \times Y$ having $\mu = \pi^X_\#\gamma$ and $\nu = \pi^Y_\#\gamma$ for
marginals. As in the preceding lemma, we say $\gamma$ vanishes outside of
$S \subset X \times Y$ if $\gamma$ assigns zero outer measure to the complement of
$S$ in $X \times Y$.

Theorem 3.2 (Numbered limb systems yield unique correlations)
Let $X$ and $Y$ be subsets of complete separable metric spaces, equipped with
$\sigma$-finite Borel measures $\mu$ on $X$ and $\nu$ on $Y$. Suppose there is a
numbered limb system
$S = \bigcup_{i=1}^\infty \mathrm{Graph}(f_{2i-1}) \cup \mathrm{Antigraph}(f_{2i})$ with the
property that $\mathrm{Graph}(f_{2i-1})$ and $\mathrm{Antigraph}(f_{2i})$ are
$\gamma$-measurable subsets of $X \times Y$ for each $i \ge 1$ and for every
$\gamma \in \Gamma(\mu,\nu)$ vanishing outside of $S$. If the system has finitely
many limbs or $\mu[X] < \infty$, then at most one $\gamma \in \Gamma(\mu,\nu)$
vanishes outside of $S$. If such a measure exists, it is given by
$\gamma = \sum_{k=1}^\infty \gamma_k$ where

$$\gamma_{2i-1} = (\mathrm{id}_X \times f_{2i-1})_\#\eta_{2i-1}, \qquad
  \gamma_{2i} = (f_{2i} \times \mathrm{id}_Y)_\#\eta_{2i}, \qquad (2)$$

$$\eta_{2i-1} = \big(\mu - \pi^X_\#\gamma_{2i}\big)\big|_{\mathrm{Dom} f_{2i-1}}, \qquad
  \eta_{2i} = \big(\nu - \pi^Y_\#\gamma_{2i+1}\big)\big|_{\mathrm{Dom} f_{2i}}. \qquad (3)$$

Here $f_k$ is measurable with respect to the $\eta_k$ completion of the Borel
$\sigma$-algebra. If the system has $N < \infty$ limbs, $\gamma_k = 0$ for $k > N$,
and $\eta_k$ and $\gamma_k$ can be computed recursively from the formulae above
starting from $k = N$.
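
To make the recursion concrete, here is a minimal sketch of formulae (2)-(3) for a numbered limb system with finitely many limbs on finite sets, where all measurability questions disappear. The function name `solve_limb_system`, the dictionary encoding of the maps $f_k$, and the sample data are assumptions of this sketch rather than part of the theorem.

```python
from fractions import Fraction


def solve_limb_system(mu, nu, maps):
    """Recover gamma from its marginals on a finite numbered limb system.

    `maps[k]` encodes f_k as a dict for k = 1..N: for odd k it sends
    points x of Dom(f_k) to y, for even k it sends points y to x.
    Returns a dict (x, y) -> mass; a negative mass would signal that no
    non-negative measure with these marginals vanishes outside S.
    """
    gamma = {}
    previous = {}  # gamma_{k+1}, empty for k = N
    for k in sorted(maps, reverse=True):
        f_k = maps[k]
        odd = k % 2 == 1
        base = mu if odd else nu
        coord = 0 if odd else 1
        # Marginal of gamma_{k+1} on the factor containing Dom(f_k).
        used = {}
        for pair, mass in previous.items():
            used[pair[coord]] = used.get(pair[coord], 0) + mass
        # Formula (3): eta_k is the leftover marginal on Dom(f_k).
        eta = {p: base[p] - used.get(p, 0) for p in f_k}
        # Formula (2): push eta_k onto the graph or antigraph of f_k.
        current = {}
        for p, mass in eta.items():
            pair = (p, f_k[p]) if odd else (f_k[p], p)
            current[pair] = mass
        gamma.update(current)
        previous = current
    return gamma


# Three limbs with I_0 = {p}, I_1 = {a}, I_2 = {q}, I_3 = {b}.
maps = {1: {"a": "p"}, 2: {"q": "a"}, 3: {"b": "q"}}
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
nu = {"p": Fraction(1, 4), "q": Fraction(3, 4)}
gamma = solve_limb_system(mu, nu, maps)
```

In this example the recursion starts from $\gamma_3$ on the graph of $f_3$, then subtracts its vertical marginal from $\nu$ to obtain $\eta_2$, and so on down to $k = 1$; the resulting `gamma` has marginals `mu` and `nu`.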

Proof. Let $S = \bigcup_{i=1}^\infty \mathrm{Graph}(f_{2i-1}) \cup \mathrm{Antigraph}(f_{2i})$
be a numbered limb system whose complement has zero outer measure for
some $\sigma$-finite measure $0 \le \gamma \in \Gamma(\mu,\nu)$. This means that
$I_k \supset \mathrm{Dom} f_k$ gives a disjoint decomposition of
$X = \bigcup_{i=0}^\infty I_{2i+1}$ and of $Y = \bigcup_{i=0}^\infty I_{2i}$, and that
$\mathrm{Ran}(f_k) \subset I_{k-1}$ for each $k \ge 1$. Assume moreover that
$\mathrm{Graph}(f_{2i-1})$ and $\mathrm{Antigraph}(f_{2i})$ are $\gamma$-measurable for
each $i \ge 1$. We wish to show $\gamma$ is uniquely determined by $\mu$, $\nu$
and $S$.

The graphs $\mathrm{Graph}(f_{2i-1})$ are disjoint since their domains $I_{2i-1}$
are disjoint, and the antigraphs $\mathrm{Antigraph}(f_{2i})$ are disjoint since
their domains $I_{2i}$ are. Moreover, $\mathrm{Graph}(f_{2i-1})$ is disjoint from
$\mathrm{Antigraph}(f_{2j})$ for all $i, j \ge 1$: $\mathrm{Ran}(f_{2i-1}) \subset I_{2i-2}$
prevents $\mathrm{Graph}(f_{2i-1})$ from intersecting $\mathrm{Antigraph}(f_{2j-2})$
unless $j = i$ since the domains $I_{2j-2}$ are disjoint, and
$\mathrm{Graph}(f_{2i-1})$ cannot intersect $\mathrm{Antigraph}(f_{2i-2})$ since
$\mathrm{Dom}(f_{2i-1}) \subset I_{2i-1}$ is disjoint from
$\mathrm{Ran}(f_{2i-2}) \subset I_{2i-3}$.

Let $\gamma_k$ denote the restriction of $\gamma$ to $\mathrm{Antigraph}(f_k)$ for $k$
even and to $\mathrm{Graph}(f_k)$ for $k$ odd. Then $\gamma = \sum_k \gamma_k$ by our
measurability hypothesis, and $\gamma_k$ restricts to a Borel measure on
$X \times \mathrm{Dom} f_k$ if $k$ is even, and on $\mathrm{Dom} f_k \times Y$ if $k$ is odd.
Defining the marginal projections $\mu_k = \pi^X_\#\gamma_k$ and
$\nu_k = \pi^Y_\#\gamma_k$, setting $\eta_k = \nu_k$ if $k$ is even and
$\eta_k = \mu_k$ if odd yields (2) and the $\eta_k$-measurability of $f_k$
immediately from Lemma 2.1. Since $\nu_{2i}$ vanishes outside
$\mathrm{Dom} f_{2i}$, from $\nu = \sum_{k=1}^\infty \nu_k$ we derive
$\nu_{2i} = (\nu - \sum_{k \ne 2i} \nu_k)|_{\mathrm{Dom} f_{2i}}$. For $k$ even, $\nu_k$
vanishes outside $\mathrm{Dom} f_k \subset I_k$, while for $k$ odd, $\nu_k$ vanishes
outside $\mathrm{Ran} f_k \subset I_{k-1}$, which is disjoint from $\mathrm{Dom} f_{2i}$
unless $k = 2i+1$. Thus $\eta_{2i} = (\nu - \nu_{2i+1})|_{\mathrm{Dom} f_{2i}}$. The
formula (3) for $\eta_{2i-1}$ follows from similar considerations.

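It remains to show the representation (2)-(3) specifies $(\gamma_k, \eta_k)$
uniquely for all $k \ge 1$, and hence determines $\gamma = \sum_k \gamma_k$
uniquely. If the system has $N < \infty$ limbs, $\mathrm{Dom} f_k = \emptyset$ for
$k > N$ and hence $\gamma_k = 0$. We can compute $\eta_k$ and $\gamma_k$ starting
with $k = N$, and then recursively from the formulae above for
$k = N-1, N-2, \ldots, 1$, so the formulae represent $\gamma$ uniquely. If
instead $S$ has countably many limbs, suppose there are two finite Borel
measures $\gamma$ and $\bar\gamma$ vanishing outside of $S$ and having the same
marginals $\mu$ and $\nu$. For each $k \ge 1$, recall that

$$X_k := \begin{cases} \mathrm{Graph}(f_k) & k \text{ odd}, \\ \mathrm{Antigraph}(f_k) & k \text{ even}, \end{cases}$$

is measurable with respect to both $\gamma$ and $\bar\gamma$. Given
$\epsilon > 0$, take $N$ large enough so that both $\gamma$ and $\bar\gamma$ assign
mass less than $\epsilon$ to $\bigcup_{k=N+1}^\infty X_k$. Set
$\gamma_k = \gamma|_{X_k}$ and $\bar\gamma_k = \bar\gamma|_{X_k}$ and denote their
marginals by $(\mu_k, \nu_k) = (\pi^X_\#\gamma_k, \pi^Y_\#\gamma_k)$ and
$(\bar\mu_k, \bar\nu_k) = (\pi^X_\#\bar\gamma_k, \pi^Y_\#\bar\gamma_k)$. Observe that
both $\gamma^\epsilon := \sum_{k=1}^N \gamma_k$ and
$\bar\gamma^\epsilon := \sum_{k=1}^N \bar\gamma_k$ are concentrated on the same
numbered limb system; it has finitely many limbs, and the differences
$\delta\mu^\epsilon = \sum_{k=1}^N (\bar\mu_k - \mu_k)$ and
$\delta\nu^\epsilon = \sum_{k=1}^N (\bar\nu_k - \nu_k)$ between the marginals of
$\bar\gamma^\epsilon$ and $\gamma^\epsilon$ have total variation at most
$2\epsilon$. Since the $\delta\mu_{2i-1} = \bar\mu_{2i-1} - \mu_{2i-1}$ are mutually
singular, as are the $\delta\nu_{2i} = \bar\nu_{2i} - \nu_{2i}$, we find the sum of
the total variations of

$$\delta\eta_k := \begin{cases} \bar\mu_k - \mu_k & k \text{ odd}, \\ \bar\nu_k - \nu_k & k \text{ even}, \end{cases}$$

is bounded: $\sum_{k=1}^N \|\delta\eta_k\|_{TV(\mathrm{Dom} f_k)} < 4\epsilon$. Using (2)
to derive

$$\|\bar\gamma_k - \gamma_k\|_{TV(X \times Y)}
  = \begin{cases} \|(\mathrm{id}_X \times f_k)_\#\delta\eta_k\|_{TV(X \times Y)} & k \text{ odd}, \\
                  \|(f_k \times \mathrm{id}_Y)_\#\delta\eta_k\|_{TV(X \times Y)} & k \text{ even}, \end{cases}
  \; = \|\delta\eta_k\|_{TV(\mathrm{Dom} f_k)}$$

and summing on $k$ yields
$\|\bar\gamma^\epsilon - \gamma^\epsilon\|_{TV(X \times Y)} < 4\epsilon$. Since
$\gamma^\epsilon \to \gamma$ and $\bar\gamma^\epsilon \to \bar\gamma$ as
$\epsilon \to 0$, we conclude $\gamma = \bar\gamma$ to complete the uniqueness
proof. ■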

As in Hestir and Williams [15], the uniqueness theorem above implies 
extremality as an immediate consequence. 



Corollary 3.3 (Sufficient condition for extremality) Let $X$ and $Y$ be
subsets of complete separable metric spaces, equipped with $\sigma$-finite
Borel measures $\mu$ on $X$ and $\nu$ on $Y$. Suppose there is a numbered limb
system $S = \bigcup_{i=1}^\infty \mathrm{Graph}(f_{2i-1}) \cup \mathrm{Antigraph}(f_{2i})$
with the property that $\mathrm{Graph}(f_{2i-1})$ and $\mathrm{Antigraph}(f_{2i})$ are
$\gamma$-measurable subsets of $X \times Y$ for each $i \ge 1$, for every
$\gamma \in \Gamma(\mu,\nu)$ vanishing outside of $S$. If the system has finitely
many limbs or $\mu[X] < \infty$, then any measure $\gamma \in \Gamma(\mu,\nu)$
vanishing outside of $S$ is extremal in the convex set $\Gamma(\mu,\nu)$.



Proof. Suppose a measure $\gamma \in \Gamma(\mu,\nu)$ vanishes outside a numbered
limb system $S$ satisfying the hypotheses of the corollary. If
$\gamma = (1-t)\gamma_0 + t\gamma_1$ with $\gamma_0, \gamma_1 \in \Gamma(\mu,\nu)$ and
$0 < t < 1$, then $\gamma_0 \le \gamma/(1-t)$ and $\gamma_1 \le \gamma/t$, so both
$\gamma_0$ and $\gamma_1$ vanish outside of $S$. According to Theorem 3.2, they are
uniquely determined by $S$ and their marginals, hence $\gamma_0 = \gamma_1$ to
establish the corollary. ■

The following example confirms that a measurability gap still remains 
between the necessary and sufficient conditions for extremality. It is a close 
variation on the standard example of a non-Lebesgue measurable set from 
real analysis. Together with the lemma and theorem preceding, this example 
makes clear that measurability is required only to allow the graphs to be 
separated from each other and from the antigraphs in an additive way. 



Example 3.4 (An acyclic set supporting non-extremal measures)
Let $\lambda$ denote Lebesgue measure and define the maps $f_0(x) = x$ and
$f_1(x) = x + \sqrt{2} \pmod 1$ on the unit interval $X = Y = [0,1]$. Notice
$\mathrm{Graph}(f_i) \subset [0,1]^2$ supports the doubly stochastic measure
$\gamma_i = (\mathrm{id} \times f_i)_\#\lambda$ for $i = 0$ and $i = 1$ (both measures
are extremal in $\Gamma(\lambda,\lambda)$ by Corollary 3.3). Irrationality of
$\sqrt{2}$ implies $S = \mathrm{Graph}(f_0) \cup \mathrm{Graph}(f_1)$ is an acyclic set,
hence can be expressed as a numbered limb system according to Hestir and
Williams [15]. On the other hand, there are doubly stochastic measures
such as $\gamma := \frac{1}{2}(\gamma_0 + \gamma_1)$ which vanish outside of $S$
but which are manifestly not extremal.
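
The acyclicity asserted in Example 3.4 can be checked by a short argument, sketched here under the simplifying convention (an assumption of this sketch) that all arithmetic is modulo 1, so that the endpoints 0 and 1 of the interval are identified.

```latex
Suppose $(x_1,y_1),(x_1,y_2),(x_2,y_2),\ldots,(x_k,y_k),(x_k,y_1)$ all lie in
$S$, with $k \ge 2$ distinct $x_i$ and $k$ distinct $y_i$ (indices mod $k$).
Above each $x$ the set $S$ contains exactly the two points $y = x$ and
$y = x + \sqrt{2}$, and beside each $y$ exactly the two points $x = y$ and
$x = y - \sqrt{2}$. Hence
\[
  \{y_i, y_{i+1}\} = \{x_i,\; x_i + \sqrt{2}\},
  \qquad
  \{x_i, x_{i+1}\} = \{y_{i+1},\; y_{i+1} - \sqrt{2}\}.
\]
Reversing the orientation of the cycle if necessary, assume
$y_2 = x_1 + \sqrt{2}$. Then $y_2 - \sqrt{2} = x_1 \ne x_2$ forces
$x_2 = y_2 = x_1 + \sqrt{2}$, and $y_3 \ne y_2 = x_2$ forces
$y_3 = x_2 + \sqrt{2}$. Inductively $x_{i+1} = x_i + \sqrt{2}$, so closing
the cycle gives
\[
  x_1 = x_{k+1} = x_1 + k\sqrt{2} \pmod 1,
  \qquad\text{i.e.}\qquad k\sqrt{2} \in \mathbf{Z},
\]
contradicting the irrationality of $\sqrt{2}$.
```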






4 Uniqueness of optimal transportation 



In this section we illustrate the significance of the foregoing results by
applying them to the uniqueness question for optimal transportation on
manifolds. Given subsets $X$ and $Y$ of complete separable metric spaces
equipped with Borel probability measures, representing the distributions
$\mu$ of production on $X$ and $\nu$ of consumption on $Y$, the
Kantorovich-Koopmans [16] [19] transportation problem is to find
$\gamma \in \Gamma(\mu,\nu)$ correlating production with consumption so as to
minimize the expected transportation cost

$$\inf_{\gamma \in \Gamma(\mu,\nu)} \int_{X \times Y} c(x,y)\, d\gamma(x,y) \qquad (4)$$

against some continuous function $c \in C(X \times Y)$. Hereafter we shall be
solely concerned with the case in which $X$ is a differentiable manifold,
$\mu$ is absolutely continuous with respect to coordinates on $X$, and the
cost function $c \in C^1(X \times Y)$ is differentiable with local control on
the magnitude of its $x$-derivative $d_x c(x,y)$ uniformly in $y$; for
convenience we also suppose $Y$ to be a differentiable manifold and $c$ to
be bounded, though this is not really necessary: substantially weaker
assumptions also suffice [8].

In this setting one immediately asks whether the infimum (4) is uniquely
attained. Since attainment is evident, the question here is uniqueness. If
$c$ satisfies a twist condition, meaning
$x \in X \mapsto c(x,y_1) - c(x,y_2)$ has no critical points for
$y_1 \ne y_2 \in Y$, then not only is the minimizing $\gamma$ unique, but its
mass concentrates entirely on the graph of a single map $f_1 : X \to Y$ (a
numbered limb system with one limb), thus solving a form of the
transportation problem posed earlier by Monge [25] [L7]. This was proved in
comparable generality by Gangbo [12], Carlier [7], Levin [20], and Ma,
Trudinger and Wang [23], building on the more specific examples of
strictly convex cost functions $c(x,y) = h(x-y)$ in $X = Y = \mathbf{R}^n$
analyzed by Caffarelli [6] and Gangbo and McCann [13], and in case
$h(x) = |x|^2$ by Abdellaoui and Heinich, Brenier, Cuesta-Albertos, Matran,
and Tuero-Diaz, Cullen and Purser, Knott and Smith, and Rüschendorf and
Rachev; see [5] [T3] [21]. Adding further restrictions beyond this twist
hypothesis allowed Ma, Trudinger, Wang, and later Loeper, to develop a
regularity theory for the map $f_1 : X \to Y$, embracing Delanoe, Caffarelli
and Urbas' results for the quadratic cost, Gangbo and McCann's for its
restriction to convex surfaces, and Wang's for the reflector antenna
design, which involves the restriction of $c(x,y) = -\log|x-y|$ to the
sphere; references may be found in [18] [31]. Unfortunately, the twist
hypothesis, also known as a generalized Spence-Mirrlees condition in the
economic literature, cannot be satisfied for smooth costs $c$ on compact
manifolds $X \times Y$, and apart from the result we are about to discuss there
are no general theorems which guarantee uniqueness of the minimizer of (4)
in this setting. With this in mind, let us state our main theorem, a
version of which was established in a more complicated economic setting by
Chiappori, Nesheim, and McCann [8]. We expect the simpler formulation and
argument given below to prove more interesting and accessible to a
mathematical readership.

Theorem 4.1 (Uniqueness of optimal transport on manifolds)
Let $X$ and $Y$ be complete separable manifolds equipped with Borel
probability measures $\mu$ on $X$ and $\nu$ on $Y$. Let $c \in C^1(X \times Y)$ be a
bounded cost function such that for each $y_1 \ne y_2 \in Y$, the map

$$x \in X \mapsto c(x,y_1) - c(x,y_2) \qquad (5)$$

has no critical points, save at most one global minimum and at most one
global maximum. Assume $d_x c(x,y)$ is locally bounded in $x$, uniformly in
$y \in Y$. If $\mu$ is absolutely continuous with respect to coordinate measure
on $X$, then the minimum (4) is uniquely attained; moreover, the minimizer
$\gamma \in \Gamma(\mu,\nu)$ vanishes outside a numbered limb system having at most
two limbs.

Proof. Here we give only the proof that there is a numbered limb system
having at most two limbs, outside of which the mass of all minimizers
$\gamma$ vanishes. A detailed argument confirming the plausible fact that the
graphs of these limbs are Borel subsets of $X \times Y$ can be found in [8].
Uniqueness of $\gamma$ then follows from Theorem 3.2.



By linear programming duality due to Kantorovich and Koopmans in this
context, it is well-known [31] that there exist potentials
$q \in L^1(X, d\mu)$ and $r \in L^1(Y, d\nu)$ with

$$q(x) = \inf_{y \in Y} c(x,y) - r(y) \qquad (6)$$

such that

$$\inf_{\gamma \in \Gamma(\mu,\nu)} \int_{X \times Y} c(x,y)\, d\gamma(x,y)
   = \int_X q(x)\, d\mu(x) + \int_Y r(y)\, d\nu(y). \qquad (7)$$

From (6) we see

$$c(x,y) - q(x) - r(y) \ge 0, \qquad (8)$$

while (7) implies any minimizer $\gamma \in \Gamma(\mu,\nu)$ vanishes outside the
zero set $Z \subset X \times Y$ of the non-negative function appearing in (8). It
remains to show this set $Z$ is contained in a numbered limb system
consisting of at most two limbs (apart from a $\mu \otimes \nu$ negligible set).
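
For a finite cost matrix, the construction in (6) and (8) can be carried out by hand. The following sketch computes $q$ from a given $r$, confirms the inequality (8), and extracts the zero set $Z$. The helper names and the sample data are hypothetical, and the vector `r` below is an arbitrary choice rather than an optimal potential, so no claim is made that (7) holds for it.

```python
def c_transform(cost, r):
    """Discrete analogue of (6): q[i] = min over j of cost[i][j] - r[j]."""
    return [min(c - r_j for c, r_j in zip(row, r)) for row in cost]


def slack(cost, q, r):
    """The non-negative function c(x, y) - q(x) - r(y) appearing in (8)."""
    return [[c - q[i] - r[j] for j, c in enumerate(row)]
            for i, row in enumerate(cost)]


def zero_set(cost, q, r):
    """Index pairs (i, j) at which inequality (8) is saturated."""
    return {(i, j)
            for i, row in enumerate(slack(cost, q, r))
            for j, s in enumerate(row) if s == 0}


cost = [[0, 2, 5],
        [3, 1, 2]]
r = [0, 1, 3]
q = c_transform(cost, r)
Z = zero_set(cost, q, r)
```

By construction every row of the slack matrix attains the value zero at least once, which is the discrete counterpart of the infimum in (6) being attained.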

From (6), $q$ is locally Lipschitz, since $d_x c(x,y)$ is controlled locally
in $x$, independently of $y \in Y$. Rademacher's theorem therefore combines
with absolute continuity of $\mu$ to imply $q$ is differentiable
$\mu$-almost everywhere; we can safely ignore any points in $X$ where
differentiability of $q$ fails, since they constitute a set of zero volume:
$\gamma[\mathrm{Dom}\, Dq \times Y] = \mu[\mathrm{Dom}\, Dq] = 1$. Taking
$x_0 \in \mathrm{Dom}\, Dq$, suppose $(x_0,y_1)$ and $(x_0,y_2)$ both lie in $Z$, hence
saturate the inequality (8). Then
$d_x c(x_0,y_1) = Dq(x_0) = d_x c(x_0,y_2)$. In case the cost is twisted,
meaning (5) has no critical points, we conclude $y_1 = y_2$, hence
$Z \cap (\mathrm{Dom}\, Dq \times Y)$ is contained in a graph. This completes the
proofs by Gangbo, Carlier, and Ma-Trudinger-Wang, of existence (and
uniqueness) of a solution $y_1 = f_1(x_0)$ to Monge's problem, pairing almost
every $x \in X$ with a single $y_1 \in Y$. Notice uniqueness follows from
Lemma 2.1 without further measurability assumptions.

In the present setting, however, we only know that $x_0$ must be a global
minimum or global maximum of the function (5). Exchanging $y_1$ with $y_2$ if
necessary yields

$$q(x) \le c(x,y_1) - r(y_1) \le c(x,y_2) - r(y_2) \qquad (9)$$

for all $x \in X$, the second inequality being strict unless $x = x_0$, in
which case both inequalities are saturated. Strictness of inequality (9)
implies $(x,y_2) \notin Z$ unless $x = x_0$. In other words, $(x_0,y_2) \in Z$
lies on the antigraph of a function $f_2(y_2) = x_0$ well-defined at $y_2$.
There may or may not be a point $y_0 \in Y$ different from $y_1$ such that

$$q(x) \le c(x,y_0) - r(y_0) \le c(x,y_1) - r(y_1)$$

for all $x \in X$. If such a point $y_0$ exists, then
$(x_0,y_1) \in \mathrm{Antigraph}(f_2)$ as above. If no such $y_0$ exists, setting
$f_1(x_0) := y_1$ yields
$Z \cap (\mathrm{Dom}\, Dq \times Y) \subset \mathrm{Graph}(f_1) \cup \mathrm{Antigraph}(f_2)$.
Since the range of $f_1$ is disjoint from the domain of $f_2$, this completes
the proof that, up to $\gamma$-negligible sets, $Z$ lies in a numbered limb
system with at most two limbs, as desired. ■

Let us conclude by recalling an example of an extremal doubly stochastic
measure which does not lie on the graph of a single map, drawn from
work of Gangbo and McCann [13] and Ahmad [1] on optimal transportation,
and developed in an economic context by Chiappori, McCann, and
Nesheim [8]. Other examples may be found in the work of Seethoff and
Shiflett [28], Losert [22], Hestir and Williams [15], Gangbo and McCann [13],
Uckelmann [2H], McCann [23], and Plakhov [2"B] . 

Imagine the periodic interval $X = Y = \mathbf{R}/2\pi\mathbf{Z} = [0, 2\pi[$ to
parameterize a town built on the boundary of a circular lake, and let
probability measures $\mu$ and $\nu$ represent the distribution of students
and available places in schools, respectively. Suppose the distribution of
students is smooth and non-vanishing but peaks sharply at the northern end
of the lake, and the distribution of schools is smooth and non-vanishing
but peaks sharply at the southern end of the lake. If the cost of
transporting a student residing at location $\theta \in [0, 2\pi[$ to school at
location $\phi \in [0, 2\pi[$ is presumed to be given in terms of the angle
commuted by $c(\theta,\phi) = 1 - \cos(\theta - \phi)$, the most effective
pairing of students with places in schools is given by the measure in
$\Gamma(\mu,\nu)$ which attains the minimum:

$$\min_{\gamma \in \Gamma(\mu,\nu)} \int_{X \times Y} c(\theta,\phi)\, d\gamma(\theta,\phi). \qquad (10)$$

According to results of Gangbo and McCann [13], this minimizer is unique,
and its support is contained in the union of the graphs of two maps
$t^\pm : X \to Y$. A schematic illustration is given in Figure 4, where the
restrictions of the support to the subsets marked by $\pm$ on the flat torus
$X \times Y$ represent $\mathrm{graph}(t^+)$ and $\mathrm{graph}(t^-)$ respectively. The
dotted lines mark $\phi - \theta = \pm\frac{\pi}{2}, \pm\frac{3\pi}{2}$. The
necessary positivity of $\gamma[J_X \times J_{Y_1}] > 0$ in this picture may be
explained by observing that although it is cost-effective for all students
to attend a school where they live, this is incompatible with the
concentration of students at the north end of the lake, and of schools at
the south end. Once this imbalance is corrected by sending a sufficient
number of northern students to southern schools by the map $t^-$, the
remaining students can be assigned to school near their home using the map
$t^+$. Periodicity of graphs on the flat torus can be used to represent the
support as a numbered limb system in more than one way; see Figure 5, which
exploits the fact that the support of $\gamma$ in Figure 4 intersects
$X \times J_{Y_2}$ in a graph and $X \times (Y - J_{Y_1})$ in an antigraph.

Chiappori, Nesheim and McCann [8] called the uniqueness hypothesis
limiting the number of critical points to at most one maximum and at
most one minimum in (5) the subtwist condition. Although it is satisfied
in the example above, it is an unfortunate fact that the subtwist condition
cannot be satisfied by any smooth function $c(\theta,\phi)$ on a product of
manifolds $X \times Y$ with more complicated Morse structures than the sphere.
It is an interesting open problem to find a criterion on a smooth cost
$c(\theta,\phi)$ on $X = Y = \mathbf{R}^2/\mathbf{Z}^2$ which guarantees uniqueness
of the minimum (10) for all smooth densities $\mu$ and $\nu$ on the torus.
Although we expect such costs to be generic, not a single example of such
a cost is known to us. Hestir and Williams' criteria for extremality seem
likely to remain relevant to such questions, and it is natural to
conjecture that the complexity of the Morse structure of the manifold $X$
plays a role in determining the required number of limbs in the system.
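
That the cost $c(\theta,\phi) = 1 - \cos(\theta - \phi)$ of the lake example has at most one maximum and one minimum in (5) can be verified by a one-line trigonometric identity; the computation below is a sketch added for the reader's convenience.

```latex
\begin{align*}
c(\theta,\phi_1) - c(\theta,\phi_2)
  &= \cos(\theta - \phi_2) - \cos(\theta - \phi_1) \\
  &= -2 \sin\!\Big(\theta - \frac{\phi_1 + \phi_2}{2}\Big)
        \sin\!\Big(\frac{\phi_1 - \phi_2}{2}\Big).
\end{align*}
% For \phi_1 \ne \phi_2 modulo 2\pi the second factor is a non-zero constant,
% so the difference is a non-zero multiple of a shifted sine in \theta. On the
% circle R / 2\pi Z such a function has exactly two critical points: one
% global maximum and one global minimum.
```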




Figure 4: Schematic support of the optimal measure from the example 

Figure 5: Two different numbered limb systems which represent Figure 4